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We implement an adaptive mesh algorithm for calculating the space and time dependence 
of the atomic density field during materials processing. Our numerical approach uses the 
systematic renormalization-group formulation of the phase field crystal model to provide 
the underlying equations for the complex amplitude of the atomic density field — a quantity 
that is spatially uniform except near topological defects, grain boundaries and other lattice 
imperfections. Our algorithm is a hybrid formulation of the amplitude equations, combining 
Cartesian and polar decompositions of the complex amplitude. We show that this approach 
leads to an acceleration by three orders of magnitude in model calculations of polycrystalline 
domain formation in two dimensions. 

PACS numbers: 81.15.Aa, 81.16.Rf, 46.15.-x, 05.10.Cc 

I. INTRODUCTION 

A fundamental theoretical and computational challenge in materials modeling is that of concur- 
rently treating phenomena over a wide range of length and time scales. For example, in studying 
the mechanical response of polycrystalline materials, one must take into account the dynamics 
and interactions of vacancies, impurities, dislocations and grain boundaries, on time scales ranging 
from atomic vibrations to system-wide diffusion times. 

Numerous approaches to handling the wide range of length scales have been proposed [1], 
including quasi-continuum methods [2-5], the heterogeneous multiscale method [6, 7], multi-scale 
molecular dynamics [8-11], multigrid variants [12] and phase field models [13-16]. In general 
one can classify different techniques as being either atomistic or continuum, and differentiate them 
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further by the characteristic time scale: density functional theory (DFT), for a quantum mechanical 
description of processes at the atomic time scale; molecular dynamics (MD) or Monte Carlo (MC) 
methods, appropriate for collective dynamics; and coarse-grained descriptions involving continuum 
fields at the mesoscale on diffusive time scales. The difficulty of merging descriptions at different 
length and time scales limits the effective application of most of these methods. Lack of a continuous 
transition between scales can induce artifacts, such as spurious reflections in a transition region 
between two levels [7, 17]. Further, any method using molecular dynamics is typically restricted 
to sub-nanosecond time scales, whereas many interesting phenomena during materials processing, 
such as microstructural pattern formation, recrystallization, heat and solute diffusion, dislocation 
glide, etc., occur over time scales which are typically greater than 10~^s. 

One continuum approach that has been used successfully, especially in the multiscale modeling 
of solidification problems [18], is the phase-field method [13]. Through the effective use of asymp- 
totics [14] and adaptive mesh refinement [19, 20], the phase- field method has been used to span 
several orders of magnitude in length, from microns to centimeters. Extensions of the method by 
Kobayashi and co-workers [21, 22], and Warren [16] also make it possible to model polycrystalline 
systems. Special forms of the free energy that incorporate strain energy have been used to model 
the qualitative features of strain- induced phase transformations [23-28]. The phase-field method 
represents a coarse-graining in space to length scales much greater than those of the interfaces and 
defects of interest in this work. As a result, the kinetic coefficients that emerge in the final contin- 
uum equations are phenomenological, and can be related to experimentally-measurable parameters 
only after a suitable asymptotic matching of the phase field equations with corresponding sharp- 
interface models [18, 29, 30]. As such, traditional phase field models do not fundamentally embody 
the emergent kinetic and elasto-plastic mechanisms that originate at the atomic scale. Perhaps the 
most important limitation of phase field models is that, in general, they do not preserve any record 
of the underlying crystal lattice, so that ad hoc approaches must be used to model the variety of 
phenomena which result from lattice interactions. 

The phase field crystal (PFC) model [31, 32] is a promising extension of the phase field model 
approach, in which the equilibrium configuration of an atomic density field is constructed to be 
periodic, rather than uniform in space. The conserved dynamics of the PFC model then naturally 
reproduce many of the non-equilibrium processing dynamics arising in real polycrystalline materi- 
als. The PFC model is founded on the insight that a free energy functional that is minimized by 
a periodic field natively includes elastic energy, anisotropy and symmetry properties of that field. 
Thus the model naturally incorporates all properties of a crystal that are determined by symme- 
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try, as well as vacancies, dislocations, and other defects. Moreover, the PFC model represents the 
evolution of the system over a time scale that is much longer than the vibrational period of atoms 
(O(10~^^s)), but much shorter than the time scale of diffusive processes in the system, such as the 
viscous glide of dislocations, which typically occur over a time scale of ©(lO^^s). The PFC model 
yields a relatively simple and well-behaved partial differential equation (PDE) for the evolution of 
the time-averaged density, giving it access to phenomena occurring on atomic length scales, but 
over diffusive time scales. The PFC method is thus able to capture atomic-scale elasticity and the 
interaction of topological defects on the same time scales that govern diffusive processes during 
phase transformations in pure materials [32-34] and alloys [35] . 

As with any model that resolves at the atomic scale, the PFC model is limited in its ability to 
model systems of realistic dimensions, because the computational grid must resolve the periodicity 
of the field. For grid converged results, a minimum of 9 grid points per period are required. In a 
physical system, the periodicity represents interatomic distance, O{10^^^m). Thus, to simulate a 
system having a characteristic dimension of one micron would require about 10^ degrees of freedom 
per spatial dimension on a uniform computational mesh. This would be a heroic computation in 
2-D, and well beyond reach in 3D, even with the use of massive parallelization. Furthermore, the 
periodic lattice precludes the effective use of adaptive mesh refinement (AMR) algorithms. 

The first three authors have recently described an approach to overcome this difficulty [36, 37], 
using the perturbative renormalization group (RG) method [38, 39] to systematically coarse-grain 
the PFC equation [40]. The basic idea is to obtain renormalization group equations of motion 
for the complex amplitude of the periodic density field, a quantity whose modulus and phase are 
spatially uniform except near regions of lattice disruption, such as at grain boundaries and at 
topological defects. From the complex amplitude, it is possible to reconstruct the atomic-scale 
density field at least within the one-mode approximation, and to compute non-trivial materials 
properties and dynamics to high accuracy (within one percent) [36, 37]. This approach, which 
we will sometimes denote as the PFC-RG method, is much faster than solving the PFC equa- 
tion directly, because the complex amplitude varies on much larger spatial length scales than the 
density itself, thus permitting the use of an adaptively-generated coarse mesh over much of the 
computational domain[36]. It is important to appreciate that the equations of motion for the 
complex amplitude must be rotationally-covariant, in order that a polycrystalline material or het- 
erogeneous microstructure can be represented without any preferred orientations imposed; this is 
readily achieved using renormalization group methods [40]. However, in a practical computation, 
the reciprocal lattice vectors of the equilibrium crystal structure are represented within a particular 
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basis, and there is the potential for interference between the density Fourier components and the 
basis[36], giving rise to artifactual "fringes" or "beats" in the corresponding Fourier components. 
The overall density does not, of course, exhibit these interference fringes, but their presence in 
the individual components means that to be properly resolved, an adaptive mesh algorithm must 
generate grid refinement in their vicinity. As a result, efficient computation becomes compromised. 

The purpose of this paper is to develop a computationally-efficient formulation of the PFC-RG 
method that enables the implementation of an AMR algorithm up to micro- and meso- length 
scales, without being deflected by artifacts arising from the choice of basis set. The approach is 
to use a hybrid representation of the complex amplitude, switching between Cartesian and polar 
coordinates as appropriate in a seamless fashion to avoid beating and coordinate singularities. The 
resultant description is fast, accurate and provides mesh refinement and coarsening in the physically 
correct locations, without artifacts arising from choice of basis or other implementation-dependent 
details. As such, our work represents a first step towards providing a systematic description of 
materials processing using continuum fields across all relevant length scales. 

The remainder of this paper is organized as follows: We introduce the complex amplitude 
equations (interchangeably called the KG equations) in Section II and illustrate the interference or 
beat problem in the Cartesian representation of these equations that limits the effectiveness of AMR 
techniques. In Section III we introduce a polar formulation of the equations that addresses the 
problem of beats, but also exhibits coordinate singularities that make these equations unwieldy for 
numerical solution. We then present a new hybrid formulation in Section IV, which is a procedure 
for solving the Cartesian equations of Section II concurrently with a reduced form of the polar 
equations of Section III in different parts of the computational domain. In Section V the hybrid 
formulation of the RG equations is demonstrated to be amenable to solution using a new finite- 
difference-based AMR algorithm specifically developed for our RG equations. Section VI presents 
numerical simulations and results, including efficiency benchmarks that clearly demonstrate the 
computational advantage of our AMR-RG approach. Section VII concludes and presents directions 
for future work. 
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II. THE COMPLEX AMPLITUDE EQUATIONS 
A. Governing equations 

In the PFC model, the evolution of the density p is given by 



""-rv^f^'l+'J (1) 



dt \5p 

where is the free energy functional, which can be written as = J dfi^f (^p, V^p, •••)), where 
/ is the local free energy density, F is a constant and 77 is a stochastic noise with zero mean 
and correlations {r){f,t)r){r' ,t')) = —TkBTV'^6{f — r')5{t — t'). The specific form of !F is chosen 
such that at high temperatures is minimized by a spatially uniform liquid state, and at low 
temperatures by a spatially periodic "crystalline" phase. Furthermore, / must be chosen such that 
T is independent of crystal orientation. These constraints naturally incorporate both elastic and 
plastic deformations. 

A free energy form that satisfies these criteria naturally produces mobile regions of liquid / solid 
coexistence separated by free surfaces, i.e., phase transformations. Elastic energy and defects in the 
crystalline phase arise from the requirement that be minimized by a spatially periodic density 
field that is independent of crystal orientation. Elder et al. [31, 32] demonstrated these properties 
of the model for a variety of applications, including studies of grain boundary energy, liquid phase 
epitaxial growth and the yield strength of nanocrystalline materials. The particular model they 
used made the following choice for the function /: 

/ = p (a AT + A {ql + V^) p/2 + up^A (2) 

where a, A, Qq and u arc model parameters that can be specified to match some specific material 
properties, such as Young's modulus and lattice spacing [31, 32]. In order to discuss the dynamical 
behavior of the PFC model, it is useful to rewrite the free energy in dimensionless units: x = fqo, 



tp = p^/ujX^, r = aAT/Xq^, r = TXq^t and F = J^u/X^ql''^ so that 

F = J dx [i; {r + {1 + V^) i>/2 + 4>^/4] (3) 
In these units the conservation law of Eq. (2) becomes 

^ = V2[(r + (l + V^)^)^ + ^3]+C (4) 

where (C(n, ti)C(^2, r2)) = SV^6{ri - r2)(5(ri - T2) and S = ukBTqf-"^ /\^ . Eq. (4), introduced by 
Elder et al [31, 32], will be referred to as the PFC equation in what follows. This equation can be 
used in any dimension by simply introducing the appropriate form for the Laplacian operators. 
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The spatial density ip can be approximated in terms of the complex amplitude Aj as 

3 3 

^ « ^ Aje^^^-"" + J2 ^^e-*^-^ + ^, (5) 



where 

ki = feo(-iV3/2-j72) 

k3 = ko{iVs/2-J/2) (6) 

are the reciprocal lattice vectors of a crystal with hexagonal symmetry, and ko is the dominant 
wavenumber of the pattern. For all the calculations shown in this paper length has been scaled 
such that ko = 1, which corresponds to an interatomic spacing of oq = 27r/(\/3/2). The complex 
amplitude equations, which constitute a coarse-grained approximation to the PFC equation were 
shown in our earlier work [36, 37] to be given by 

^ = CjAj - 3Aj\Ajf - 6Aj J2 -Qi^U^l (7) 

k:k^j k:k^j 

where _7, A; G [1,3] and 

Cj = [1 - - 2i\ij ■ V] \-r - Sijj^ - (V^ + 2ikj ■ V}^1 (8) 

is a rotationally covariant operator. The superscript "*" denotes complex conjugation. The pa- 
rameters r (< 0) and "ip (> 0) control the bifurcation from a uniform liquid phase to a crystalline 
phase with hexagonal symmetry. Specifically, r is proportional to the temperature quench from a 
critical temperature Tc, while tp is the mean density in the system. We refer to this form as the 
Cartesian representation because the amplitudes are expressed along each coordinate direction. 

The rotational covariance of the operator C permits the incorporation of multiple crystal ori- 
entations using only the basis vectors in Eq. (6). To see this consider a density field ip defined by 
Eq. (5) with triangular lattice basis vectors kj(^) (where |kj(^)| = 1) that are rotated by an angle 
9 from the basis vectors kj in Eq. (6), i.e. 

3 3 

,p{e) = Aje^^^^^>'' + ^*e-^k. W-x ^ ^_ 
i=i j=i 

Equation (9) describes the density field of a grain misoriented with respect to the basis vectors. 
Writing the basis vectors as kj(^) = kj -|- 6]s.j{9), where the vector Skj{9) measures the rotation of 



7 



each lattice vector, we obtain 



3 3 



— ikj -x 



(10) 




or 



3 



3 



(11) 



where 



(12) 



Thus grains arbitrarily misoriented from the global basis kj can still be described in terms of kj by 
suitably representing the complex amplitude Aj in polar form according to Eq. (12). A straight- 
forward way to include differently oriented grains in the system is to specify an initial condition 
via Eq. (11). By making the amplitude a non-uniform complex function with a periodic structure, 
multiple grain orientations are automatically included. Fig. 1 illustrates this idea. Fig. 1(a) shows 
the real component of one of the three complex amplitude functions Aj, specified by Eq. (12), 
and Fig. 1(b) shows the corresponding density field constructed using Eq. (11). Since Eq. (7) is 
rotationally covariant, it allows these "beat" structures in the amplitudes (and therefore the cor- 
responding orientation of the grain) to be preserved as the system evolves, thereby enabling the 
representation of polycrystalline systems with a single set of basis vectors. 



A straightforward approach to solving Eq. (7) is to determine the real and imaginary parts of 
the complex amplitudes Aj directly, using the Cartesian definition. This leads to six equations 
that can be evolved concurrently using a suitable time integration scheme. The second order finite 
difference spatial discretizations of the Laplacian and gradient operators are given in the Appendix. 
This approach leads to limited success of AMR techniques because of the beats. 

To illustrate this effect, we simulated heterogeneous nucleation and growth of a two-dimensional 
film, randomly placing twelve randomly oriented crystals of initial radius Sir in a square domain of 
side 2567r with periodic boundary conditions. The largest misorientation angle between grains was 
6 = 7r/12. The amplitude equations in Cartesian form were solved using an adaptively evolving 
mesh (described in detail below). The model parameters were r = —0.25 and -ijj = 0.285, the 
smallest mesh spacing was Axmin = 7r/2, while the largest mesh spacing at any given time was 



B. Limitations of the Cartesian representation of Equation (7) 



(a) K(yli) (b) i> 

FIG. 1: (a) Real component of the complex amplitude Ai. As the grain in the bottom- left corner is aligned 
with the basis kj in Eq. (6) its amplitude is constant, while amplitudes of the remaining misoriented grains 
have "beats", (b) Density field i/j reconstructed using Eq. (11). Clockwise from the lower left corner, 9 = 0, 
7r/24 and tt/6. 

^^max = 2^(Axmm) Corresponding to 5 levels of refinement. On a uniform grid, this simulation 
requires 1025 x 1025 = 1,050,625 nodes with the PFC equation, and 513 x 513 = 263, 169 nodes 
with the amplitude equations. A time step of At = 0.04 was used. 

Fig. 2 shows the crystal boundaries and grid structure at various times during the simulation. 
The field plotted is the average amplitude modulus, X]j=i^j/'^- Although the grid starts out 
quite coarse {t = and t = 88) at several locations in the computational domain because of the 
large liquid fraction, this advantage falls off dramatically once the crystals evolve, collide, and 
start to form grain boundaries. In particular, once all the liquid freezes, only a few grains that 
are favorably oriented with respect to kj show any kind of grid coarsening at all. Those that are 
greatly misorientated with respect to kj lead to frequency beating, causing the number of nodes 
in the adaptive grid to increase rather than decrease. The polycrystal mesh shown in Fig. 2(f) 
has 219, 393 nodes, which is very near that on a uniform grid. Therefore the adaptive refinement 
algorithm applied to a Cartesian formulation of Eq. 7 gives at best a marginal improvement over 
a fixed grid implementation. The main purpose of this paper is to present a methodology for 
overcoming this problem. 
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(d)t=248 (e)t=320 (f)t=552 

FIG. 2: (Color online) Evolution of a polycrystalline film simulated with complex amplitude equations, 
Eq. (7), on an adaptive grid. Note that the grid does not coarsen inside many of the grains (misoriented 
with respect to kj ) because of the fine scale structure of the "beats" . The colored field plotted is the average 
amplitude modulus, which is "red" inside the crystal phase, "blue" in the liquid phase, "green" at the 
crystal/liquid interface, and "yellow" near defects. 

III. COMPLEX AMPLITUDE EQUATIONS IN A POLAR REPRESENTATION 

A. Governing equations 

We find that the computational benefits of AMR are potentially greater if, instead of solving for 
the real and imaginary components of Aj, we solve for the amplitude moduli = \Aj\, and the 
phase angles = arctan{Q{Aj) /?R.{Aj)), which are spatially uniform fields irrespective of crystal 
orientation. Together these two fields constitute a polar representation of Aj. 

In this section we derive evolution equations for and directly from Eq. (7), by applying 
Euler's formula for a complex number, i.e. Aj = ^^e**^, and then by equating corresponding real 
and imaginary parts on the left- and right-hand sides of the resulting equations. In this manner 
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we get the coupled system of equations, 



at 



+ 



(13) 



and 



at 



(14) 



where 



C^(*,-,$,) = 5i. 
C^(*,-,$,) = 9. 



[V2 + 2zkj- • V] (*je^*Ol 

J 

[V2 + 2ikj • V] (^je^*Ol 

[V^ + 2ikj • V] (C^(^j,^j)e^'^^) 



(15) 



and so on for the remaining Cs. Prom here on we refer to the evolution equations for and $j 
as the phase/amplitude equations, whereas Eq. (7) will be referred to as the complex amplitude 
equation. Unfortunately, the phase/amplitude equations in Eqs. (13) and (14) turn out to be quite 
difficult to solve globally. The principal difficulties are summarized below. 

The field is nearly constant within the individual grains and varies sharply only near grain 
boundaries, rendering its equation ideally suited for solution on adaptive meshes. The field 
on the other hand, if computed naively as arctan(9(^j)/5}?(^j)), is a periodic and discontinuous 
function [55] bounded between the values — tt and tt, with a frequency that increases with increasing 
grain misorientation. This poses a problem similar to that previously posed by the beats, with the 
grid this time having to resolve the fine scale structure of Further, one may need to resort 
to shock-capturing methods in order to correctly evaluate higher order derivatives, and resolve 
jumps where $j changes value from tt to — tt and vice- versa. Complications are also caused by $j 
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(c)Lineplot of along solid line in (a) (d)Lineplot of <l>i along dashed line in (c) 



FIG. 3: (Color online) (a) and (c): The field ^'i, smooth everywhere except near interfaces and defects, (b) 
and (d) The field $1, which is computed naively as arctan(3(Ai)/5R(yli)) is periodic and discontinuous. The 
chaotic fiuctuations in $1 in the regions outside the crystals correspond to the liquid phase where $1 has no 
physical meaning. The rapid, but periodic, variations of $1 in the left grain is due to its large misoricntation 
angle of 7r/6. In contrast, the grain on the right is oriented along kj causing <i>i to vary much more smoothly. 

being undefined in the liquid phase, and the tendency for ^j, which appears in the denominator 
on the right hand side of Eq. (14), to approach zero at those locations. This calls for some type of 
robust regularization scheme [56] for the phase equations. These problems are clearly highlighted 
in Fig. 3, which shows the impingement of two misaligned crystals and the corresponding values 
of ^1 and ^i. 

Ideally, one would like to reconstruct from the periodic a continuous surface + 2n7r (where 
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n is an integer) which would be devoid of jumps, and therefore amenable to straightforward reso- 
lution on adaptive meshes. The implementation of such a reconstruction algorithm however, even 
if possible, requires information about individual crystal orientations, and the precise location of 
solid/liquid interfaces, defects, and grain boundaries at every time step, making it very computa- 
tionally intensive. Further, such an algorithm would be more appropriate in the framework of an 
interface-tracking approach such as the level set method [41], rather than our phase-field modeling 
approach. 

Despite these issues with the polar (phase/amplitude) equations progress can be made, under 
certain non-critical approximations, by solving the phase/amplitude equations in the interior of 
crystalline regions, in conjunction with the Cartesian complex amplitude equations in regions closer 
to domain boundaries and topological defects. 

B. Reduced equations and the frozen phase gradient approximation 

The main idea that will be developed in this and subsequent sections is that of evolving the 
phase/amplitude and complex amplitude equations simultaneously in different parts of the domain, 
depending on where they can most appropriately be applied. The phase/amplitude formulation 
is solved in the crystal interior, away from defects, interfacial regions, and the liquid phase. The 
complex amplitude equations are solved everywhere else in the computational domain. This does 
away with the need for regularizing the phase equations where *j (since S> in the crystal 
interior) as well as the issue of the phase being undefined in certain regions. We overcome the 
remaining issues with the phase equation, i.e. the difficulty of evaluating derivatives of the phase 
and the need to resolve its periodic variations via certain controlled approximations described next. 

Let us examine the results of a fixed grid calculation performed using the complex amplitude 
equations, illustrated in Fig. 4, showing a sequence of line plots of the quantity A{d^i/dx) = 
5$i/5a:|j^g4Q — d^i/dx\f. inside the growing crystal is seen to be essentially time invariant. 
As the crystal on the left grows, it can be seen that A{d^i/dx) stays close to zero inside. We have 
verified that this is also true for the y component of V$i, and both components of V$2 and V$3. 

These results suggest that we may employ a locally frozen phase gradient. Note that the as- 
sumption of a frozen phase gradient does not mean that itself cannot change. $j can continue 
to evolve as per Eq. (17) under the constraint of a fixed V<I>j, although the changes may actually 
be quite small. On the other hand, when similarly oriented crystals collide to form a small angle 
grain boundary, it is energetically more favorable for grains to locally realign (i.e. for to 
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FIG. 4: (Color online) (f>i and its time evolution for the pair of crystals shown in Fig. 3. (a) Contours of 
d^i/dx a,t t = 520. (b) Lineplot of d^i/dx along solid line in (a), (c) A{d^i/dx) along dashed line in (a). 
Just as with "ifj, the components of V$j are also practically constant inside the individual crystals. The 
spike in (b) corresponds to a defect on the grain boundary. As seen from the time series in (c) for d^i/dx, 
V$j hardly changes in the crystal bulk during its evolution. 



change close to grain boundaries) in order to reduce orientational mismatch [42-45], rather than to 
nucleate dislocations. Since such interaction effects originate at the grain boundary, where the full 
complex RG equations will be solved, we anticipate that our assumption will not lead to artificially 
"stiff" grains. 

This approximation allows us to neglect third and higher order derivatives of and [57], 
which allows us to reduce Eqs. (13) and (14) to the following second order PDEs: 



dt 



dt 




(16) 
(17) 



^ '1 \ k / \ k 

where and contain only first and second order derivatives of ^ j and <I>j. Eqs. (16) and (17) 
are referred to as the reduced phase/amplitude equations. 

The task of evolving the phase/amplitude equations is now considerably simplified, as only 
derivatives up to second order in <I>j need to be computed. While the Laplacian and gradient 
of can be computed in a straightforward manner using Eqs. (Al) and (A5) respectively, the 
gradient of <I>j needs to be computed with a little more care (in order to avoid performing derivative 
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operations on a discontinuous function). The result is that 



'3 

Thus, the gradient operation on a discontinuous function is now transformed into gradient 



operations on the smooth components of the complex amplitude Aj. Further, V^<l>o is computed as 



^3 

V • V$j, where the divergence operator is discretized using a simple second order central difference 
scheme. 

However, as can be seen from Eq. (18), V$j now depends on gradients of the real and imaginary 
components of Aj , which may not be properly resolved in the crystal bulk as we intend to coarsen 
the mesh there. To address this point, we assume that V$j is frozen temporally in the crystal 
bulk. This assumption implies that once is accurately initialized in the crystal interior via 
Eq. (18), after ensuring adequate resolution of the components of Aj, it need not be computed 
again. For example, in simulations of crystal growth from seeds, we can start with a mesh that 
is initially completely refined inside the seeds, so that V$j is correctly computed. Once initial 
transients disappear and the crystals reach steady state evolution, the growth is monotonic in the 
outward direction. Prom this point on, V$j hardly changes inside the crystal bulk and the grid 
can unrefine inside the grains while correctly preserving gradients in V$j . Note that the apparent 
discontinuities in $j no longer need be resolved by the grid. 



IV. A HYBRID FORMULATION 



In order to implement our idea of evolving Eq. (7), and Eqs. (16) and (17) selectively within 
different regions, we begin by dividing the computational domain into two regions where each set 
of equations may be evolved simultaneously in a stable fashion. The region where Aj is computed 
in terms of its real and imaginary parts is called X, and the region where ^j and $j are computed 
is called P. We ensure that subdomain P is well separated from locations with sharp gradients, such 
as interfaces and defects. Otherwise, errors resulting from our approximations may grow rapidly, 
causing X to invade P, which will in turn require us to solve the complex equations everywhere. 
We will further assume that the decomposition algorithm is implemented after a sufficient time, 
when initial transients have passed, and that the crystals are evolving steadily, which implies that 
^j inside the crystals has reached some maximum saturation value vJ/J*"^. The scenario we have in 
mind is sketched in Fig. 5, with P constituting the shaded regions and all other regions correspond 
to X. 
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Liquid 




FIG. 5: Sketch illustrating the idea of selectively evolving the complex amplitude and phase/amplitude 
equations in different regions of the computational domain. and $j are evolved inside the shaded circles, 
which fall well inside the crystalline phase, while the real and imaginary components of Aj are evolved 
everywhere else. 

The pseudo-code shown in Algorithm 1 presents a simple algorithm to achieve this decomposi- 
tion. The algorithm first determines nodes with exceeding some minimum value ^\^J^"-^^ and 
|V^'j| beneath some limit ei. The nodes satisfying these conditions constitute domain P, while 
those failing to, constitute X. The P nodes are then checked again to see if the quantity |V(|V<I>j|)| 
is under some limit e2- Nodes in set P that fail to satisfy this condition are placed in set X. The 
parameters 7, ei, and €2 are chosen to ensure the largest possible size of set P. A small problem is 
caused by the fields and |V<I>j| not being perfectly monotonic. As the limits ei and €2 are sharp, 
several small islands (clusters of grid points) of X or P can be produced, which are detrimental 
to numerical stability. We have resolved this issue via a coarsening algorithm that eliminates very 
small clusters of X and P. 

Fig. 6 shows results from a uniform grid implementation of Algorithm 1. No islands are present, 
as the algorithm decomposes the domain in an unsupervised manner. It is noteworthy that domain 
boundaries are distorted in Figs. 6(c) and 6(d) in response to the formation of a grain boundary 
between the two crystals, after being roughly hexagonal at earlier times. The fact that the domain 
separatrices maintain a safe distance from the grain boundary ensures that the phase/amplitude 
equations are not evolved in regions containing sharp gradients in V^>j. Parameter values used 
were 7 = 0.85, ei = 0.0005, and 62 = 0.003. 

The remarkable feature of our numerical scheme is that solving different sets of equations in 
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Algorithm 1 (Color online) Domain decomposition. The parameters 7, ei, and €2 are heuristic. 
Compute ^^"^ 

3 ' j 

{Split domain based on the magnitude of ^'j and |V^'j|} 
for i = 1 to maxnode do {loop over all nodes} 
count = 

for j = 1 to 3 do {loop over amplitude components} 

if > and |V*j| < ei then 

count++ 

end if 
end for 

if count = 3 then 

domain = P {passed test, solve phase/amplitude equations} 
else 

domain = X {failed test, solve complex equations} 
end if 
end for 

{Split domain based on |V (|V$j|)|} 
for i = 1 to maxnode do {loop over all nodes} 
count = 

if domain — P then {check only nodes that passed previous test} 
for j = 1 to 3 do {loop over amplitude components} 
if |V(|V$j|)| < 62 then 

count++ 
end if 
end for 

if count 3 then 

domain = X {failed test, solve complex equations} 
end if 
end if 
end for 

X and P does not require doing anything special near the domain boundaries, such as creating 
"ghost" nodes outside each domain, or constraining solutions to match at the boundaries. Both 
sets of variables, and {Aj}, are maintained at all grid points irrespective of the domain 

they belong to, with one set allowing easy computation of the other [58]. Therefore the transi- 
tion between the two domains is a continuous one in terms of field variables, which allows the 
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{c)t = 280 {d)t = 360 

FIG. 6: (Color online) Filled contour plot showing the time evolution of three misoriented crystals. The 
field plotted is ^'3. Superimposed on the plot as solid curves are the boundaries that separate domains X 
and P, with P being enclosed by the curves. 

finite difference stencils in Eqs. (Al) and (A5) to be applied to the respective fields witliout any 
modification near domain boundaries. 

V. SOLVING THE RG EQUATIONS WITH ADAPTIVE MESH REFINEMENT 

As highlighted in earlier work [20, 46-48] the use of dynamic adaptive mesh refinement alters 
the numerical mesh resolution dynamically such as to place high resolution near phase boundaries 
and a very low resolution in bulk regions where there is little activity. This dramatically reduces 
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computer memory requirements, allowing larger systems to be simulated. It also significantly 
reduces overall simulation times. The determining factor guiding the use or otherwise of an AMR 
technique to solve a particular problem is the simple criterion 

Interface length (or Area) 
Domain area (or Volume) 

The phase/amplitude RG equations discussed above are precisely in the class of problems that 
can benefit from and is amenable to adaptive mesh refinement. Indeed, as will be shown below, 
the speedup in time contained intrinsically by the physics of the PFC equation is complemented 
by the concomitant bridging of length scales afforded by the RG equations solved adaptively. 

We solved the RG equations discussed above using a new C++ adaptive mesh refinement (AMR) 
algorithm that uses finite differences (FD) to resolve spatial gradients [49]. While it is typical to 
use the finite element method (FEM) in situations involving non-structured meshes, adaption using 
finite differences schemes allows approximately a 5-10 fold increase in simulation speed (measured 
as CPU time per node) over previous (FEM) formulations involving traditional phase field models 
[46, 47]. This improvement increases further still in cases where model equations contain spatial 
gradients or order higher than two, such as in the case of Eqs. (7), (16) and (17). The basic reason 
for the difference in speeds is that FEM formulations generally have more overhead due to their 
reliance on local matrix multiplication at multiple Gauss quadrature points. This overhead time 
becomes even more pronounced when using elements of order higher than two, as is required if an 
FEM formulation is to be used to resolve the spatial derivatives involved with the RG equations 
in this work. 



A. AMR Algorithm 

At the heart of our algorithm is a routine that creates a non-uniform mesh that increases nodal 
density in specific regions according to a local error estimator. Nodes are grouped into pseudo- 
elements, managed by dynamic tree data structures, as used in a finite element formulation by 
Provatas et al [47] . The quad-tree structure illustrated in Figure 7 is a hierarchy of elements where 
every level deeper in the tree results in elements of higher refinement. Every element has associated 
with it 4 corner 'nodes' and 5 'ghost' nodes: 4 in the center of each edge and 1 in the center of the 
element. The ghost nodes facilitate interpolation when neighboring elements are at different levels 
of refinement. Each node is a structure containing field values, such as phase, amplitude and the 
real and complex components of Aj. A node also contains information about its nearest neighbors. 
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Edge ghost nodes can serve as field nodes if a resolution mismatch occurs across neighbor elements. 
This can be seen in the schematic in figure 7. The field equations are not solved at ghost nodes, 
but are instead interpolated linearly from the nodal values of the element (or edge) to which they 
belong. The inclusion of ghosts nodes simplifies the calculation of local derivatives. 



GRfD 




ELEMENT DATA STRUCTURE 




FIG. 7: Schematic of a quad-tree data structure used in adaptive meshing. An element splits and creates 
'children' beneath it in the tree structure. Nodes are created at the corner of each element and 'ghost' nodes 
are placed in the center of elements and along edges that have no real nodes, to accommodate resolution 
mismatches. Ghost nodes are approximated by interpolation from the element to which they belong. 

An adapter algorithm refines / unrefines the tree structure by using a user-defined error estimator 
computed for each element. Refinement is done by bisection as shown in Figure 7, and unrefinement 
is done by fusing four child elements into their parent element. Once refinement /unrefinement of 
all elements is complete, the adapter produces an array of nodes, each of which is the center of a 
local grid. The equation solver accepts this array as input and then solves the equations at these 
nodal points. This method of node organization modularizes our algorithm and allows the solver 
and adapter to be separately parallelized. 

The mesh data structure contains nodes with detailed knowledge of their local neighbors, each 
of which exist at the center of a 5 x 5 uniform mesh (See Fig. 8). During adaptation, the data 
structure applies rules that either increase (if higher accuracy is needed) or decrease (to decrease 
memory requirements) the size of this local nodal mesh. Also during adaptation, the data structure 
and its associated elements and nodes respect the following six rules (3 applied to each element 
and 3 to each node). Rule 1 ensures mesh cohesion and maintains accuracy in the solution of the 
PDEs. 
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RULE 1: Neighboring elements can vary by at most one level 
RULE 2: All elements contain 9 nodes, real or ghost 

RULE 3: Element neighbors are all at the same level (Therefore elements may have NU LL as a 

neighbor) 

RULE 4: Node neighbors are all at the same level (Nodes will never have NULL as a neighbor, 
but may instead have a ghost as a neighbor) 

RULE 5: Each node is at the center of what is defined as a uniform mini-mesh 

RULE 6: Each node is assigned the resolution (Ax) of the most refined element attached to it. 

The adaptive process is controlled primarily at the tree level, but invokes function calls inside 
of the element and node structures. Its basic flow is illustrated in Algorithm 2. This process allows 
an element-by-element examination using recursion to maintain the rules above. Once the process 
of adaptation is complete the adapter creates an array of nodes each with the index of its neighbors 
in the array, which is then used to solve the equations before adapting again. We also note that 
element 'leaves' are stored by their resolution level in an array of element lists. Element resolution 
is restricted to vary by at most one level compared to the element neighbors (Rule 1). 

Algorithm 2 Regridding algorithm 

Delete the Ghost Pointer List 

while Some elements have split or unsplit do 

Check all elements for unsplitting criteria 

Check all elements for splitting criteria 
end while 

Update node neighbors 
Build Ghost List 

Set Ghost Averaging Information 

Element splitting (refinement) is the dominant process in the algorithm, taking precedence over 
element coarsening (i.e. fusing four children elements into their parent). Elements are searched 
one refinement level at a time, starting from the second highest level of resolution. Each element is 
considered for splitting using an error criterion computed for that element. If splitting is required, 
an element data structure it is pushed onto a stack, where its neighbors are subsequently checked 
against Rule 1. If splitting will violate Rule 1, the neighbors are recursively split until all refined 
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elements satisfy Rule 1. When an element is split, it and its updated neighbor elements generate 
new real and ghost nodes, as well as information about their neighbors. The splitting algorithm is 
illustrated in Algorithm 3. 

Algorithm 3 Splitting Algorithm 

for ElcnicntLcvcl=maxLcvcl-l to do 

while Level is not empty do 

if Element Splitting eriteria is met then 
PUSH Element onto stack 
while Stack is not empty do 

if Top of Stack's neighbors need splitting then 

Push Needed neighbors onto stack 
else 

Pop the Element from the Stack 
Create 4 new children elements 
Push children onto ElementLevel+1 
Add New Nodes to the Node List 
Remove former Ghosts from Ghost List 
end if 
end while 
end if 
end while 
end for 

The unrefinement algorithm starts at the lowest level of refinement. Again, Rule 1 above must 
be imposed. The elements are examined to see if a parent requires splitting. If it does not, the 
parent has its four child elements eliminated, assuming their relationship with the neighbors allows 
it (i.e. Rule 1). Recursive unsplitting of elements is not allowed. As in the case of refinement, 
during unrefinement, element neighbors and node neighbors are updated. The flow of unsplitting 
is shown in Algorithm 4. 

As discussed above, nodes follow rules 4, 5 and 6. This is enforced by the introduction of ghost 
nodes in elements and by maintaining rule 1. Rule 5 is maintained by creating a list of local nodes 
(ghost or real) and by maintaining rule 1 during splitting. Rule 6 determines which node neighbors 
are chosen and the local node spacing (Ax). A real node will contain the minimesh on which 
is applied the governing equations; a ghost node provides instructions on how to interpolate the 
values needed for real node calculations. 
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Algorithm 4 UnSplitting Algorithm 

for ElementLevel=niaxLevel-l to do 

while Level is not empty do 

if Element Splitting criteria is NOT met then 

if Element's Parent's Splitting criteria is NOT met then 
if Parent's Children Have NO Children then 

if Parent's neighbor's are Same level or One level Lower then 
ADD Parent to ElementLevel+1 
REMOVE Parent's Children from ElementLevel 
REMOVE Unneeded nodes from node list 
REMOVE Ghosts from Ghost list 
Center Node Becomes a Ghost 

Edge Nodes Become Ghosts depending on neighbor levels 
DELETE Children 
end if 
end if 
end if 
end if 
end while 
end for 



B. Handling of Ghost Nodes in The Hybrid Formulation 

A very useful feature of the AMR implementation outlined in the previous section is that each 
node sits at the center of a uniform 5x5 mini- grid, illustrated in Fig. 8. Let us focus on the node 
represented by the lightly shaded circle (labeled F). The data structure ensures that node F has 
access to all other nodes on the wireframe. The advantage of this construction is that it allows 
us to use the uniform grid finite difference stencils for the Laplacian and gradient operators in 
Eqs. (Al) and (A5) respectively, instead of modifying them node-wise to accommodate variations 
in grid spacing. This requires the introduction of ghost nodes, shown as open circles in Figure 8. 

The scheme used to interpolate values at the ghost nodes is a potential source of error in the 
numerical solution, and must be chosen carefully, ^j, d^j/dx, d^j/dy, ^{Aj) and 9(>lj) are very 
smoothly varying functions, and therefore we linearly interpolate their values to the ghost nodes. 
Values at ghosts residing on element [59] edges, for example node M in Fig. 8, are obtained by 
averaging values of the two end nodes Q and S, whereas values at ghosts residing at the center of an 
element, node N for example, are obtained by averaging the values at the four corner nodes, P, Q, 
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FIG. 8: Schematic showing a portion of the adaptive grid where the refinement level changes. Filled circles 
(and node F) are real nodes where the fields are computed, whereas the open circles are non-computational 
ghost nodes where the fields are interpolated. 

R, and S. We have found this interpolation scheme to be quite stable. We note however that, given 
the near-periodic variations in ?R-{Aj) and 9(Aj), especially in misoriented grains, higher order 
interpolation functions (such as cubic splines) could improve solution accuracy, while strongly 
enforcing continuity of fields across elements. This issue will be examined in future work. 

The interpolation of $j at the ghost nodes is a little more delicate. Since $j is a discontinuous 
function, a simple average of the values at the neighboring real nodes may not always give the 
correct answer, especially because the grid does not resolve the discontinuities. Even if it did, 
a simple average could lead to the wrong result. As an example, consider the two real nodes Q 
and S in Fig. 8, with values <I>^ = tt — 6i and = — vr + 62 where 61 and 62 are very small but 
positive real numbers, on either side of a discontinuity in $1. We wish to determine the value at 
the ghost node M that lies between Q and S. Although the values of $1 at Q and S are essentially 
equivalent in phase space, differing in magnitude by approximately 27r, a simple average gives 
$^ = ((^2 — 5i)/2 0, which is quite wrong. 

In order to interpolate correctly we need to make use of V$j. For example in the above case, 
the total change in the phase from Q to S is obtained by integrating the directional derivative of 
$1 along the edge QS, i.e. 



Eq. (20) can be evaluated numerically, and the accuracy of the result depends on how well d^i/dy 




(20) 
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is approximated. Consistent with our earlier assumptions, we approximate d^i/dy as piecewise 
constant where 



which leads to 



1 / d^i 
2 
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dy 



s dy 
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Since d^i/dy is constant along the edge QS, $i must vary linearly along QS. Hence at node M, 
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$f = + ^A$f^ : $f G [-7r,7r: 



(23) 



Interpolation of $j at element center ghost nodes, such as N, is done in a similar manner by 
interpolating linearly from ghost nodes at the centers of opposite element edges. Once again, 
this scheme might be improved by choosing higher order polynomials to approximate inside 
elements. 



C. Refinement Criteria in the Hybrid Formulation 

Traditionally, AMR algorithms rely on some kind of local error estimation procedure to provide a 
criterion for grid refinement. Zicnkicwicz and Zhu [50] developed a simple scheme for finite element 
discretization of elliptic and parabolic PDEs by computing the error in the gradients of the fields 
using higher order interpolation functions. Berger and Oliger [51] on the other hand estimated the 
local truncation error of their finite difference discretization of hyperbolic PDEs via Richardson 
extrapolation. Depending on the equations being solved and the numerical methods being used, one 
scheme may be more effective than another. We use a very simple and computationally inexpensive 
refinement criterion that works nicely for our equations, based purely on gradients in the various 
fields. 

The outline of the algorithm used to decide whether or not to split an element is given in 
Algorithm 5. The algorithm initially computes absolute changes in the real and imaginary parts 
of Aj, and the x and y components of V$j in each element. We use absolute differences in place 
of derivatives in order for the refinement criterion to be independent of element size. We note that 
prior to implementing this algorithm, the domain decomposition algorithm described above (see 
Figure 1) needs to be called first in order to split the computational domain into subdomains X 
and P. 



25 



The process begins by examining an element flag to see if the element lies on the separatrix 
between X and P, or in k layers from the boundary, within the P subdomain. If so this element is 
split. This ensures that the fields are always resolved on the interface between X and P, and just 
within the boundary on the P side. The latter is required because of the higher order derivative 
operations that need to be performed while evolving the complex amplitude equations in X. 

If the element does not split and belongs to X (where Aj are the field variables) , the variations 
in the real and imaginary parts of Aj are checked to see if they exceed a certain bound ei . If any 
one of them does, this element is split. If, on the other hand, the element belongs to P where the 
phase/amplitude equations are solved, variations in the x and y components of V$j are checked 
to see if they exceed another limit 62- If they do, this element is split. If none of the above 
criteria are satisfied, the element is not split and is placed in the list of elements to be checked for 
coarsening. Since refinement criteria are recursively applied to the quadtree, the finest elements 
are automatically placed around domain separatrices, solid/liquid interfaces, and defects. 

VI. RESULTS AND COMPUTATIONAL EFFICIENCY 

Using the various approximations and algorithms described in the previous sections we solved 
the phase/amplitude and complex equations simultaneously in different parts of our computational 
domain using adaptive mesh refinement. Algorithm 6 shows the flow of control in the main routine. 
The complex amplitude equations, Eq. (7), are initially evolved everywhere until time Nfr, when 
initial transients have dissipated, and the crystals evolve steadily outward. The domain is then 
split into subdomains X and P, following which the reduced phase/amplitude equations, Eqs. (16) 
and (17), are evolved using a forward Euler time stepping scheme in subdomain P. The grid is 
refined after a predetermined number of time steps N adapts which is chosen heuristically. We note 
that the current implementation can handle only periodic boundary conditions. Work is currently 
underway to enable handling of more general boundary conditions. 

Using this implementation, we simulated the same problem (same initial and boundary condi- 
tions and problem parameters) that was solved adaptively in section II B using only the complex 
amplitude equations. Figure 9 shows the crystal boundaries and grid structure at various times 
during the simulation. Ntr was chosen to be 3000 for this simulation. With At = 0.04, this implies 
that this simulation is identical to the previous one until t = Ntr x At = 120. Thus, Figs. 9(a) and 
9(b) are identical to Figs. 2(a) and 2(b). The advantage of the hybrid implementation starts to 
appear from Fig. 9(c), whenceforth, unlike in Fig. 2, even grains that are misoriented with respect 
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FIG. 9: (Color online) Evolution of a polycrystalline film simulated with Eq. (7), and Eqs. (16) and (17), 
using our adaptive mesh refinement algorithm. The conditions in this simulation are identical to those in 
section II B and Fig. 2. Note that the grid now coarsens inside grains that arc misoriented with respect 
to kj , and "beats" are no longer a limitation. The colored field plotted is the average amplitude modulus, 
which is "red" inside the crystal phase, "blue" in the liquid phase, "green" at the crystal/liquid interface, 
and "yellow" near defects. 

to the basis kj show grid unrefinement within. It is also noteworthy that the grid remains refined 
near solid/liquid interfaces, grain boundaries and defects, ensuring that key topological features 
are correctly resolved. 

We now compare solutions from the two simulations quantitatively. We find it more informative 
to make a pointwise comparison of the two solutions along cross sections of the domain, rather than 
comparing solution norms, as we believe that this is a more stringent test of our implementation. 
We choose two random cuts, one running parallel to the y axis at Xcut = 70tt, and the other 
parallel to the x axis at ycut = llSvr. The solutions are compared along these cuts at two different 
times, t = 168 and t = 552 in Figs. 10 and 11 respectively. The solid curves in the figures (labeled 
"hybrid") are variations in ^'i and d^i/dx along the entire length of the domain as computed with 
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Algorithm 5 Criteria for element splitting 



{Nl, N2, N3 and N4 are the element nodes in clockwise manner} 
for i = 1 to 3 do {loop over amplitude components} 

{Change in real part of Ai over clement} 

DR, = |N1 ^ n{A,) - N2 ^ n{A,)\ + |N2 ^ 3?(A,) - N3 ^ ^{Ai)\ 
+ |N3 ^ 3?(A,) - N4 ^ ?R{Ai)\ + |N4 ^ ?R{Ai) - Nl ^ ^{Ai)\ 
{Change in imaginary part of Ai over element} 
Dli = |N1 - N2 ^ + |N2 ^ - N3 ^ 3(^01 

+ |N3 ^ - N4 9(^01 + |N4 ^ - Nl ^ 9(^01 

{Change in x component of V$i over element} 

DGPx^ = |Ni ^ - N2 + |n2 ^ - n3 ^ 

+ |N3 ^ - N4 ^ d^i/dx\ + |N4 ^ a$i/aa; - Nl ^ d^i/dx\ 

{Change in y component of V$i over element} 

DGPY, = |N1 ^ - N2 ^ d^i/dy\ + |N2 ^ d^i/dy - N3 ^ d<^>i/dy\ 

+ |N3 d<^i/dy - N4 ^ + |N4 ^ a^i/^y - Nl ^ S^i/ay] 

end for 

if element on X/P boundary OR k layers inside P then 

Split element and exit 
else if element inside X then 

count =0 

for i = 1 to 3 do {loop over amplitude components} 

if DRi > ei OR Dli > ei then 
count ++ 

end if 
end for 

if count ^ then 

Split clement and exit 
end if 
else {element is inside P} 
count = 

for i = 1 to 3 do {loop over amplitude components} 

if DGPX, > £2 OR DGPYi > ca then 
count++ 

end if 
end for 

if comit 7^ then 

Split element and exit 
end if 
end if 
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Algorithm 6 Flow of control 

InitVar() {Initialize program variables and parameters} 

InitGrid-O {decide where to initially refine/coarsen based on initial condition} 
UpdateGhostsFV() {interpolate Aj, ^j, ad, ghost nodes} 
ComputePhaseGradients(l) {compute V^j everywhere} 
UpdateGhostsPG() {interpolate V$j at ghost nodes} 
for i = 1 to Ntr — 1 do {evolve until initial transients subside} 

if i mod Nadapt — OR i = 1 then 
AdaptGrid() {uses Algorithm 5} 

end if 

EvolveComplexAmpO {evolve Eq. (7) everywhere} 
UpdateAllFields() {compute "^j and $j from Aj} 
UpdatcGhostsFVO 
ComputePhaseGradicnts(l) 
UpdateGhostsPGO 
end for 

DivideDomain() {call Algorithm 1 to split domain into X and P} 
for i = Ntr to Nend do {evolve after transients subside} 
if i mod Nadapt = then 
DivideDomain() 
AdaptGridO 
end if 

EvolveComplexAmpO {evolve Eq. (7) in X} 
EvolvePhascAmpO {evolve Eqs. (16) and (17) in P} 
UpdateAllFields() {evaluate and $j in X, evaluate Aj in P} 
UpdatcGhostsFVO 

ComputePhascGradicnts(i) {compute in X only, frozen gradient approx.} 
UpdateGhostsPGO 
end for 

the current ("hybrid") implementation, whereas the symbols (labeled "complex") are variations 
in the same variables as computed using fully complex equations (section II B). The agreement 
is essentially perfect, indicating that our simplifications based on approximations in the preceding 
sections work reasonably well. 

Because the performance of our algorithm is sensitively tied to the type of problem that is being 
solved, it is difficult to come up with a simple metric that quantifies its computational efficiency. 
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(a) t = 168 (b) t = 552 



FIG. 10: (Color online) Numerical solution along the line x — TOtt in Fig. 9 compared to the results using 
the hybrid scheme. Some of the data points in the complex solution were omitted for clarity of presentation. 




(a) t = 168 (b) t = 552 

FIG. 11: (Color online) Numerical solution along the line y — IIStt in Fig. 9 compared to the results using 
the hybrid scheme. Some of the data points in the complex solution were omitted for clarity of presentation. 

The difficulty lies in accounting for the change in CPU time per time step, which increases with the 
number of mesh points. For example, Fig. 12 shows the number of nodes in this simulation over 
time. Clearly, an adaptive grid implementation has a significant computational advantage over an 
equivalent fixed grid implementation at the early stages of the simulation. 
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FIG. 12: (Color online) Number of computational nodes in the grid as a function of time, for simulations 
in Fig. 2 (black curve) and Fig. 9 (red curve). The number of nodes reaches a constant value after all the 
liquid freezes. The dashed line shows the number of nodes required by a uniform grid implementation of 
the complex amplitude equations for the same problem. 




One performance measure is the projected speed of our implementation compared to a uniform 
grid implementation of the PFC equation. This speedup is estimated by the simple formula, 

S^Nppc^^MnG-AG^l^^ (24) 

Nrg-AG ^tpFC Q ^ ' 

where Nppc is the number of grid points required to solve the PFC equation, Nrg~ag is the 
number of grid points required in a hybrid implementation of the amplitude/RG equations, Atppc 
and AtpG-AG are the time steps used in the respective implementations, the factor 1/6 comes from 
solving six RG equations in place of the [one] PFC equation directly, and (3 £ [0, 1] is the overhead 
of the AMR algorithm. The difficulty lies in fixing N^q^aq which is constantly changing with 
time. One estimate for N^q^aq is the number of nodes averaged over the entire simulation. This 
can be computed easily by dividing the area under the hybrid curve in Fig. 12 by the total number 
of time steps taken, which gives Npc-AC = 104, 747. Further, based on heuristics collected while 
running our code, we conservatively estimate mesh refinement /coarsening to constitute about 3% 
of the CPU time, which gives /3 = 0.97. Therefore, from Eq. (24) we have 

^ 1,050,625 0.04 1 

S = ' ^ ' „ X X - X 0.97 = 8.1. (25) 

104,747 0.008 6 ^ ^ 

We do recognize that for a more accurate estimate of S we would also need to consider overhead 
costs that may come from sub-optimal cache and memory usage owing to the data structures used. 
Hence these numbers should only be considered as rough estimates of true speedup. 
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While a speedup factor of 8 may not seem to be a great improvement in computational efficiency, 
one sliould bear in mind that the number of nodes in the AMR algorithm scales (roughly) linearly 
with interface/grain boundary length, which is quite substantial in the system we just simulated. 
Thus, one should not expect to derive the maximum computational benefit when simulating small 
systems with large numbers of grains. On the other hand, with this new method, we can now 
simulate the growth of a few crystals in a much larger system. We choose a square domain of side 
40967r, which in physical dimensions translates to 0.722 /xm, if we assume an interatomic spacing of 
4 A [60]. We initiate three randomly oriented crystals, two a little closer together than the third, 
so that a grain boundary forms quickly. The crystals are shown at different times in Fig. 13. The 
simulation was terminated at t = 3960 when memory requirements exceeded 1 GB, after running 
on a dedicated 3.06 GHz Intel Xeon processor for about one week. 

Let us calculate the speedup factor for this simulation as we did previously, after 70, 000 time 
steps (t = 2800, Fig. 13(f)). Fig. 14 shows that the number of nodes in the adaptive grid varies 
nearly linearly with the number of time steps, and we estimate the average number of nodes 
Nag-RG to be 200,721. The same simulation on a uniform grid using the PFC equation would 
have required 268,435,456 nodes (not possible on our computers). We estimate /? = 0.98. In this 
case the speedup is about three orders of magnitude, 

^ 268,435,456 0.04 1 ^„ 

S = ' ' — X TTT^T^TT X X 0.98 = 1091. (26) 
200,721 0.008 6 ^ ' 

Fig. 15 shows vividly the range of length scales from nanometers to microns spanned by our grid 
in this simulation, highlighting its "multiscalc" capability. 

We would like to emphasize that as with any adaptive grid implementation, refinement criteria 
can change 5 by a factor that is approximately constant. In order to enable testing our imple- 
mentation on a much larger domain subject to the available memory resources, the criteria were 
relaxed. Note however, that even if we had roughly doubled the number of finely spaced nodes 
near the interfaces and the grain boundary, which would lead to a significantly more accurate 
calculation, S would still be about 500 times faster than an equivalent implementation of the PFC 
equation on a uniform grid. 



VII. CONCLUDING REMARKS 



In this article, we have presented an efficient hybrid numerical implementation that combines 
Cartesian and polar representations of the complex amplitude with adaptive mesh refinement, and 
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(d)t=1620 (e)t=2080 (f)t=2800 



FIG. 13: (Color online) Micro-scale simulation of two dimensional crystal growth with amplitude equations 
using AMR. 

allows the modeling capabilities of the PFC equation to be extended to microscopic length scales. 
Depending on the choice of application, we have shown that our scheme can be anywhere from 
1 — 3 orders of magnitude times faster than an equivalent uniform grid implementation of the PFC 
equation, on a single processor machine. We anticipate that this advantage will be preserved when 
both implementations are migrated to a parallel computer, which is an important next step required 
to give the RG extension of the PFC model full access to micro- and meso- scale phenomena. 

In conclusion, we have shown that multiscale modeling of complex polycrystalline materials 
microstructure is possible using a combination of continuum modeling at the nanoscale using the 
PFC model, RG and related techniques from spatially-extended dynamical systems theory and 
adaptive mesh refinement. 

We regard this work as only a first step for our modeling approach with the RG extension 
of the PFC to be successfully applied for studying important engineering and materials science 
applications. We have identified a few issues that require immediate attention. The first, although 
an implementation issue, is critical, and has to do with using amplitude equations for applications 
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FIG. 14: Number of computational nodes in the grid as a function of time for the 1 /xm x l^um domain. 
The growth is almost linear. 

involving externally applied loads and displacements to a polycrystal that has been evolved with 
our equations. Simple applications could be, subjecting the polycrystal to shear, uniaxial, or 
biaxial loading states [32, 33]. Such boundary conditions are difficult enough to apply to the scalar 
field in the PFC equation [34]. Meaningful translation to equivalent boundary conditions on 
the amplitudes and phases of ip can be a very difficult task, requiring the solution of systems of 
nonlinearly coupled equations at the boundaries.We have not yet investigated this issue in any 
detail. 

Our derivation of the amplitude equations [40] was based on a one mode approximation to the 
triangular lattice, and as we always chose parameters fairly close to the boundary between the 
triangular phase and coexisting triangular and constant phases, i.e. jr + 3tp^\ ^ 1, the amplitude 
equations we derived were within their domain of validity and our results were quite accurate. 
It is almost certain that a onc-modc approximation will not give similarly accurate results when 
|r + 3-0^1 ~ 0(1) (although it would be interesting to see how much the error actually is). It is 
not clear if this in any way precludes certain phenomena from being studied with our equations, 
as we can always choose parameters to stay in the regime where the one-mode approximation is 
valid, but if it does, amplitude equations for dominant higher modes need to be systematically 
developed. 

An important assumption made in the derivation of our so called "hybrid" formulation of 
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FIG. 15: (Color online) The above grid spans roughly three orders of magnitude in length scales, from a 
nanometer up to a micron. The leftmost box resolves the entire computational domain whereas the rightmost 
resolves dislocations at the atomic scale. 

the complex amplitude equations is that of locally freezing the phase gradient vector V<&j. In 
fact, it is this assumption that allows us to effectively unrefine the interior of grains and gain 
significant speedup over the PFC equation. If for example, the problem we are studying involves 
the application of a large external shear strain that could change V<I>j in the grain interior via 
grain rotation, it is uncertain whether our algorithm would continue to maintain its computational 
efficiency over the PFC. This is again a matter worth investigating. 
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APPENDIX A: DISCRETIZATION OF OPERATORS 

1. Laplacian 

The Laplacian of a function f{x,y) is discrctized at point {xi,yj) = (iAx,jAx) using a nine 
point finite difference stencil as shown below, where Ax is the mesh spacing. 

y2^| ^ fi+l,j + fi-l-j + fi,j+l + fi,j-l I fi+l,j+l + fi-l,j-l + fi-l,j+l + fi+l,j-l 

A Fourier transform of this isotropic discretization, described by Tomita in [52], is shown to very 
nearly follow the —k^ isocontours. 

2. Gradient 

The gradient of a function /(x, y) is discretized at point (xj, yj) = {iAx, jAx) using a nine point 
second order finite difference stencil as shown below, where Ax is the mesh spacing. The stencil 
is designed to minimize effects of grid anisotropy which can introduce artifacts in the solution, 
especially on adaptive grids. We have 

V/l,,,- = Vef .+0{Ax') 

= { ^'%Ax~''' ) ^+ { ^''^2Ax'~' ) ^^^^ 

But 



fx + fy\(i+j\ , f -fx + fy\ I -i + j 



and hence we also have 

V/L- = V^/.. + 0(Ax2) 



fi+l,j+l - fi~l,j-l\ ( i + j\ . f fi-l,j+l - fi+l,j-l\ ( -i + j 



2V2AX J \ V2 J V 2V2AX J \ V2 J 

fi+l,j+l - fi~l,j-l - fi-l,j+l + fi+l,j-l \ ^ 

4Ax J 

Using the discrete forms for the gradient in Eqs. (A2) and (A4) we can write the isotropic second 
order discretization as 

V/li,,- = U Ve/ . . + V^/ . .1 + 0{Ax^). (A5) 



i,3 
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A discretization scheme similar to Eq. (A5) is given by Sethian and Strain [53]. 
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